Abstract

Background: Despite improved survival for the patients with diffuse large B-cell lymphoma (DLBCL), the prognosis after 
relapse is poor. The aim was to identify molecular events that contribute to relapse and treatment resistance in DLBCL. 

Methods: We used high-resolution array comparative genomic hybridization (aCGH) and gene expression microarrays to analyse
51 prospectively collected pretreatment tumour samples from clinically high-risk patients treated in a Nordic phase II study
with dose-dense chemoimmunotherapy and central nervous system prophylaxis. The major finding was validated at
the protein level immunohistochemically in a trial specific tissue microarray series of 70, and in an independent validation
series of 146 patients.

Results: We identified 31 genes whose expression changes were strongly associated with copy number aberrations. In 
addition, gains of chromosomes 2p15 and 18q12.2 were associated with unfavourable survival. The 2p15 aberration
harboured the COMMD1 gene, whose expression had a significant adverse prognostic impact on survival. Immunohistochemical
analysis of COMMD1 expression in two series confirmed the association of COMMD1 expression with poor prognosis. 

Conclusion: COMMD1 is a potential novel prognostic factor in DLBCLs. The results highlight the value of integrated 
comprehensive analysis to identify prognostic markers and genetic driver events not previously implicated in DLBCL. 
Introduction

Diffuse large B-cell lymphoma (DLBCL) is the most common 
lymphoid neoplasm. It is an aggressive lymphoma entity, and only 
50% of patients can be cured with anthracycline-based CHOP 
(cyclophosphamide, doxorubicin, vincristine, and prednisone) or 
CHOP-like chemotherapy. However, following the addition of 
rituximab or etoposide to CHOP, or the administration of CHOP 
dose-densely at two-week intervals (CHOP-14), response rates and
survival have significantly improved [1-5]. Despite these advances, 
20-30% of patients experience disease relapses or have primary 
refractory disease. Such patients could benefit from alternative 
therapies if their clinical outcome could be more accurately
predicted at the time of diagnosis. Therefore, the identification of
biological prognostic factors that could identify high-risk DLBCL 
patients is a priority. 

Genome-wide molecular profiling has revealed a high degree of 
complexity in DLBCL, and significantly accelerated the understanding
of oncogenic mechanisms in lymphomagenesis [6,7]. On
the basis of gene expression profiling (GEP), DLBCL is classified
into distinct molecular subtypes [8-11]. Three major DLBCL 
entities, showing germinal center B-cell (GCB), activated B-cell 
(ABC)-like, and primary mediastinal B-cell lymphoma signatures 
have been recognized. Many oncogenic mechanisms distinguish 
GCB and ABC subtypes. For example, chromosomal transloca- 
tions involving BCL2 and the c-REL locus amplification on 
chromosome 2p occur predominantly in the GCB DLBCLs
[10,12]. In contrast, ABC DLBCLs are characterized by 
transcriptional overexpression of BCL2 and a constitutive activa- 
tion of the NF-kB signaling pathway [10,13]. According to the 
gene expression based classification, the patients in the molecular 
subgroups also have different outcomes in response to chemo- and 
chemoimmunotherapy [9,10].

Over the past few years, progress in molecular genetics and 
sequencing technologies has also revealed several previously 
unrecognized genetic lesions and pathways that are involved in 
DLBCL [14-17]. For example, recurrent mutations inactivating
histone and/or chromatin modifying genes, and genes involved in
immune recognition have been identified. However, despite the 
rapidly growing number of genetic aberrations reported in 
DLBCL, association of these findings with treatment outcome 
remains to be shown. 

We have integrated the information from high-resolution gene 
copy number and expression microarrays to identify the most 
likely "driver gene" candidates associated with DNA copy number 
aberrations (CNAs) and poor prognosis in DLBCL. Importantly, 
with our cohort of high-risk DLBCL patients treated homogeneously
in a phase II study with dose-dense chemoimmunotherapy
and systemic CNS prophylaxis, we were able to identify a
genomic region harbouring a gene that has a survival effect and 
thus is a candidate for a novel molecular marker for poor 
prognosis. 

Materials and Methods 

Ethics Statement 

Written informed consent was obtained prior to treatment and 
sampling from all patients included in the NLG-LBC-04 study. 
Clinical protocol and sampling were approved in the participating 
countries at the national level by Regional Committee on Health 
Research Ethics in Glostrup, Denmark, the Hospital District of 
Helsinki and Uusimaa Regional Committee on Medical Research 
Ethics in Finland, Oslo Regional Committee for Medical and 
Health Research Ethics in Norway, and Lund Regional Ethics 
Committee in Sweden. The trial was registered at ClinicalTrials.gov,
number NCT01502982. For the retrospectively collected
validation cohort, approval was obtained from the National 
Authority for Medicolegal Affairs, Finland and Helsinki University 
Central Hospital, Finland. 

Patients 

The prospectively collected screening (aCGH) and tissue 
microarray (TMA) cohorts consisted of DLBCL patients who 
were less than 65 years old and had primary high-risk (age-adjusted
International Prognostic Index (aaIPI) score 2-3) disease.
They were treated in the Nordic phase II NLG-LBC-04 protocol 
with six courses of R-CHOEP14 (rituximab, cyclophosphamide, 
doxorubicin, vincristine, etoposide, and prednisone supported with 
G-CSF) followed by systemic CNS prophylaxis with one course of 
high-dose methotrexate and one course of high-dose cytarabine 
[18]. The original clinical study included 156 eligible patients. 
Histological diagnosis was established from surgical or needle 
biopsy of the pretreatment tumour tissue by local pathologists 
according to current criteria of the World Health Organization 
classification [19], and subsequently reviewed by expert hematopathologists
on a national basis. The patient selection for this
molecular study was based on availability of fresh frozen tissue
containing adequate material for DNA extraction and aCGH
(screening cohort; n = 51), and for RNA extraction and gene
expression profiling (n = 38). The infiltration of lymphoma cells in
the tissue was assessed from frozen tissue sections using hematoxylin-eosin
and toluidine blue stainings. Formalin-fixed paraffin-embedded
(FFPE) tissue containing adequate material was used for
the preparation of TMAs (TMA cohort; n = 70).

To validate the findings, an independent series of 146 primary 
DLBCL patients treated with chemoimmunotherapy at the 
Helsinki University Central Hospital between 2001 and 2010 
was used. The patients were treated with R-CHOP (n = 126),
R-CHOEP (n = 11) or another regimen (n = 9). The cases were selected
based on the availability of FFPE tissue and clinical information. 

Samples 

RNA and DNA were extracted with the Qiagen AllPrep DNA/RNA/Protein
Mini kit. CNAs were analysed from the DNA of 51
tumour samples hybridized onto Agilent Human (4x) 180 K
CGH arrays. Tumour samples from 38 patients were eligible for
mRNA analyses using Affymetrix Human Exon 1.0 ST arrays. All
hybridizations were performed at the Biomedicum Genomics
(University of Helsinki) according to manufacturer's instructions.
Hybridization protocols and raw expression microarray data are
available at the ArrayExpress archive:
http://www.ebi.ac.uk/arrayexpress/experiments/E-MEXP-3488 and
http://www.ebi.ac.uk/arrayexpress/experiments/E-MEXP-3463.
All tissue samples were collected before treatment.

qRT-PCR 

Expression of the COMMD1 (Hs04190004_m1) and XPO1
(Hs00418963_m1) genes was validated by quantitative real-time
polymerase chain reaction (qRT-PCR) using TaqMan Gene 
Expression Assays (Assays-On-Demand, Applied Biosystems) and 
the ABI Prism 7500 Fast Sequence Detection System (Applied 
Biosystems) for 24 available tumour samples. Normalization for 
the quantity of DNA was done by performing simultaneous qRT- 
PCR for GAPDH (TaqMan Pre-Developed Assay Reagents, 
Applied Biosystems). Each assay was determined by a comparative 
cycle threshold method, using the arithmetic formula provided by 
the manufacturer. All assays were performed in triplicate. 
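
As a rough illustration of a comparative cycle threshold calculation, the sketch below uses the widely used 2^-ddCt formula. This is an assumption: the text only refers to "the arithmetic formula provided by the manufacturer", and the function name, the calibrator sample and all Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct,
                        calibrator_target_ct, calibrator_reference_ct):
    """Fold change by the 2^-ddCt formula (assumed, not taken from the paper).

    Each argument is a list of replicate Ct values (e.g. triplicates).
    target = gene of interest, reference = housekeeping gene (e.g. GAPDH),
    calibrator = the sample that all other samples are compared against.
    """
    delta_sample = mean(target_ct) - mean(reference_ct)
    delta_calibrator = mean(calibrator_target_ct) - mean(calibrator_reference_ct)
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical triplicate Ct values for one tumour sample and one calibrator.
fold = relative_expression(
    target_ct=[24.2, 23.8, 24.0],
    reference_ct=[18.0, 18.1, 17.9],
    calibrator_target_ct=[26.0, 26.1, 25.9],
    calibrator_reference_ct=[18.0, 18.0, 18.0],
)
print(round(fold, 2))
```

A lower Ct for the target relative to the reference gene means more transcript, so the sample above comes out as roughly four-fold higher than the calibrator.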

Subgroup Classification by Gene Expression Profiling 

Samples involved in exon array analysis were divided into 
germinal centre B-cell (GCB) and non-GCB subgroups by gene 
expression profiling (GEP). Briefly, we utilized the log ratios of 44 
genes from the gene expression panel by Wright et al. [20] and 
agglomerative hierarchical clustering (complete linkage) to divide 
samples into two subgroups. The IGHA I gene was not on our array 
and was subsequently dropped from the analysis. Following the 
Wright classification [20], samples in one main branch of the 
resulting cluster tree were categorized as GCB and samples in the 
other branch as non-GCB DLBCLs. In addition, all samples were 
classified into GCB and non-GCB phenotypes immunohistochemically
(IHC) according to the Hans algorithm [21].
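
The clustering step can be sketched as follows. This is a minimal pure-Python illustration of agglomerative hierarchical clustering with complete linkage, cut at two clusters. The Euclidean distance, the sample names and the three-gene toy profiles are assumptions for illustration; the text does not state the distance measure, and the real analysis used the log ratios of the Wright et al. gene panel.

```python
from math import dist


def complete_linkage_two_clusters(profiles):
    """Merge clusters agglomeratively (complete linkage) until two remain.

    profiles: dict mapping sample id -> list of log ratios (same gene order).
    Returns two lists of sample ids, i.e. the two main branches of the tree.
    """
    clusters = [[name] for name in profiles]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(dist(profiles[x], profiles[y]) for x in a for y in b)

    while len(clusters) > 2:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical log ratios for three signature genes in four samples.
toy_profiles = {
    "S1": [2.0, 1.8, -1.0],
    "S2": [1.9, 2.1, -0.8],
    "S3": [-1.5, -2.0, 1.2],
    "S4": [-1.7, -1.6, 0.9],
}
branches = complete_linkage_two_clusters(toy_profiles)
print(branches)
```

Which branch is labelled GCB and which non-GCB is decided afterwards from the direction of the signature genes; the sketch only produces the two branches.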

Immunohistochemical Analyses of COMMD1 

IHC stainings were performed on FFPE tissue sections on TMA 
slides containing 2-4 tissue cores/patient, with a core diameter of
1 mm (TMA cohort), or whole tissue sections (independent
validation cohort). After deparaffinization, heat-induced epitope
retrieval (121°C, 3 min), and blocking of endogenous peroxidase,
the slides were incubated with anti-COMMD1 antibody (1:200,
Sigma-Aldrich, Prestige Antibodies) at 4°C overnight. Staining 
was completed with Vectastain ABC kit reagents (Vector 
Laboratories) according to the manufacturer's instructions, and 
slides were counterstained with hematoxylin. 
To score the stainings, COMMD1 positivity was evaluated
from one to three high-power fields (hpf; ×630 magnification) with
the Leica DM LB bright-field microscope (Leica Microsystems
GmbH) and a camera attached to it (Olympus DP50, InStudio
1.0.1 Software). The most representative areas with an intense
staining pattern were first selected at low magnification and
then digitized at hpf, resulting in microscopic images with
an area of 0.02 mm². Images were subsequently scored using
a computerized image analysis system [22]. All scorings were
performed blindly.

Quantitative image analyses were performed using Anduril
[22]. The colour space of each image was divided into four
expected representative colour classes: brown, blue, white and
background. The background class included faint brown and blue
colours considered to be unspecific staining. The colour values
were selected by pointing at 15 example colours for each class. All
images were subjected to the same class specification. Each
pixel was assigned to a class by finding the nearest example colour
value, and the final staining coverage was calculated from the area of
each class present in each image.
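
The nearest-example-colour rule described above can be illustrated with a short sketch. It is not the Anduril implementation: the RGB colour space, the squared Euclidean distance, the example colours (far fewer than the 15 per class used in the study) and the function names are all assumptions.

```python
from collections import Counter

# Hypothetical example colours (RGB) picked for each class.
EXAMPLES = {
    "brown": [(120, 70, 30), (150, 90, 50)],
    "blue": [(60, 80, 160), (90, 110, 190)],
    "white": [(250, 250, 250)],
    "background": [(215, 200, 185), (200, 205, 225)],
}


def classify_pixel(rgb, examples=EXAMPLES):
    """Assign a pixel to the class of its nearest example colour."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(
        (squared_distance(rgb, example), label)
        for label, colours in examples.items()
        for example in colours
    )[1]


def class_coverage(pixels, examples=EXAMPLES):
    """Fraction of the image area (pixel count) that falls in each class."""
    counts = Counter(classify_pixel(p, examples) for p in pixels)
    return {label: counts.get(label, 0) / len(pixels) for label in examples}


# A toy four-pixel "image".
pixels = [(125, 72, 35), (88, 105, 185), (248, 249, 251), (210, 198, 190)]
coverage = class_coverage(pixels)
print(coverage)
```

Here the staining coverage is expressed as the brown fraction of all pixels; which classes enter the denominator in the original analysis is not specified in the text.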

Data Analysis 

In copy number profiling the aCGH data were first normalized 
with locally weighted scatterplot smoothing (LOWESS) normal- 
ization (four iterations, 30% window) and the data were denoised 
with circular binary segmentation (p<0.05, split undo = none)
[23]. The background noise level of copy number arrays was 
estimated by calculating the median probe signal of all arrays. An 
aberration was called significant if it was two standard deviations 
apart from the median. We identified minimal common CNA 
regions in which a CNA overlapped in 10% of the samples with 
both transcriptome and copy number data, and subsequent
analyses were restricted to these regions. The Database of
Genomic Variants (DGV, version 10, Nov 2010) [24] was used 
to determine locus specific copy number variants (CNVs). In short, 
for each gene in our CNA regions, the number of overlapping 
CNVs in the DGV according to genomic locus was counted. 
Genes with more than 10 CNVs were excluded from the 
subsequent analyses. Exon array expression data were normalized 
and transformed to gene expression level data by the Multiple 
Exon Array Preprocessing (MEAP) algorithm [25]. 
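
The calling rule (two standard deviations from the median) and the 10% recurrence filter can be sketched as below. This is a simplified illustration under stated assumptions: the standard deviation is taken over the pooled probe signals, the recurrence threshold is read as "at least 10%", and all values and function names are hypothetical.

```python
from statistics import median, pstdev


def call_segments(segment_means, probe_signals, n_sd=2.0):
    """Call each segmented log ratio as 'gain', 'loss' or 'none'.

    A segment is called when it lies more than n_sd standard deviations
    away from the median probe signal (the assumed background level).
    """
    centre = median(probe_signals)
    spread = pstdev(probe_signals)
    calls = []
    for value in segment_means:
        if value > centre + n_sd * spread:
            calls.append("gain")
        elif value < centre - n_sd * spread:
            calls.append("loss")
        else:
            calls.append("none")
    return calls


def is_recurrent(calls, kind, min_fraction=0.10):
    """True if the aberration type is present in at least min_fraction of samples."""
    return calls.count(kind) / len(calls) >= min_fraction


# Hypothetical background probe signals and one region's segment means
# across ten samples.
probe_signals = [-0.2, -0.1, 0.0, 0.0, 0.1, 0.2]
region_means = [0.6, 0.05, -0.1, 0.0, 0.3, -0.5, 0.1, 0.0, 0.02, -0.03]
calls = call_segments(region_means, probe_signals)
print(calls, is_recurrent(calls, "gain"))
```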

In order to find genes with a significant association of expression 
and CNA, all genes in the expression data, which were located in a 
minimal common CNA region, were first matched with their 
respective segmented copy number values. The samples were split 
into two groups based on their CNA status, separately for gains 
and losses, and a signal-to-noise statistic on the expression of each 
gene was calculated [26]. A p-value for each signal-to-noise score 
was calculated with a permutation test. Gain or deletion was 
associated with expression for genes that displayed up- or 
downregulation in CNA samples but stable expression in non-CNA
samples (p<0.05). Moreover, only genes that exhibited CNA
in at least three patients were analysed. 
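
A signal-to-noise statistic with a permutation p-value can be sketched as follows. The formula (difference of group means divided by the sum of group standard deviations) is the commonly used definition and is an assumption here, as are the two-sided test, the number of permutations and the expression values.

```python
import random
from statistics import mean, stdev


def signal_to_noise(group_a, group_b):
    """(mean_a - mean_b) / (sd_a + sd_b); assumed form of the statistic."""
    return (mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided p-value obtained by shuffling the CNA / non-CNA labels."""
    rng = random.Random(seed)
    observed = abs(signal_to_noise(group_a, group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if abs(signal_to_noise(pooled[:n_a], pooled[n_a:])) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical log2 expression of one gene in samples with and without a gain.
gained = [9.1, 9.4, 8.9, 9.6]
not_gained = [7.0, 7.3, 6.8, 7.1, 7.4, 6.9, 7.2, 7.5]
score = signal_to_noise(gained, not_gained)
p_value = permutation_p_value(gained, not_gained)
print(round(score, 2), p_value < 0.05)
```

The sketch covers only the association score; the additional requirement of stable expression in non-CNA samples is not modelled.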

A Chi square test was performed to evaluate the differences in 
the frequency for the prognostic factors. Categorical data were 
compared using the Fisher's exact test (two-sided). Pearson 
correlation coefficient was calculated to evaluate the correlation 
between the expression values from microarray and qRT-PCR 
analyses. All genes with altered copy number levels in at least five 
samples were analysed for patient survival. Survival curves with 
corresponding p-values were calculated using Kaplan-Meier
analysis with the log-rank test. Receiver operating characteristic 
(ROC) curve analysis was used to determine the ideal cutoff values 
for survival outcomes. Univariate analyses were performed
according to the Cox proportional hazards regression model.
The progression-free survival (PFS) was calculated as the period 
between the dates of registration and lymphoma progression or 
relapse. Otherwise, the patients were censored at the last date of 
follow-up. Patients in remission were censored at the last date they 
were known to be alive. Patients who died due to causes other than 
lymphoma were censored at the date of death. Overall survival 
(OS) was calculated as a period between registration date and date 
of death. Surviving patients were censored at the last date they 
were known to be alive. Lymphoma-specific OS was calculated as 
a period between registration date and the date of death due to 
lymphoma. P-values less than 0.05 were considered to indicate 
statistical significance. Data analyses were done with the compu- 
tational framework Anduril [22], which is designed for systematic 
integration, analysis and result interpretation of large-scale 
molecular data, and with IBM SPSS Statistics 20.0. 
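
To make the role of censoring in these survival definitions concrete, the following is a minimal sketch of the Kaplan-Meier product-limit estimator. It is not the Anduril or SPSS implementation, and the follow-up times and event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate at each distinct event time.

    times:  follow-up in months from registration.
    events: True if the endpoint (e.g. progression or relapse) occurred,
            False if the patient was censored at that time.
    Returns a list of (time, survival probability) tuples.
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        failures = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1.0 - failures / at_risk
        curve.append((t, survival))
    return curve


# Six hypothetical patients: three events, three censored observations.
months = [6, 12, 12, 20, 30, 40]
progressed = [True, False, True, True, False, False]
curve = kaplan_meier(months, progressed)
for time, prob in curve:
    print(time, round(prob, 3))
```

Censored patients leave the risk set without lowering the curve, which is why a patient who died of an unrelated cause contributes to PFS only up to the date of death.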

Results 

Clinical Characteristics of the Screening Cohort 

The baseline characteristics of the screening (aCGH) cohort of 
51 patients treated in the NLG-LBC-04 protocol [18] are shown in 
Table 1. Median age of the patients was 55 years (range, 20-65 
years). The overall response rate (ORR; complete response (CR)+ 
partial response (PR)) in this study population was 98%. At a median
follow-up of 55 months (range 31-101
months), 15 patients had relapsed, three had experienced CNS relapse
and 12 had died. Five of the deaths were not lymphoma-related.
Predicted 5-year PFS was 69%, lymphoma-specific OS 85%, and 
OS 76%. 

Gene Copy Number Aberrations 

Genome-wide copy number analysis of 51 lymphoma samples
with aCGH revealed several gains and losses. All patients had at 
least one abnormality with an average number of 17.5±9.8 CNAs 
per patient. The most frequently (>10%) altered regions as well as
their frequencies and possible target genes are shown in Table 2. 
Some samples exhibited narrower alterations than others, which 
caused the small variation in the CNA frequencies. Of the 
recurrent CNAs previously reported in DLBCL [27-30], gains in 
3q, 7q22.1, and 19p13, and loss in 6q were observed in 9.8% of
the patients. 

The association of CNAs with molecular subgroups is summarized
in Table S1 in File S1. Notably, GCB type DLBCL was
characterized by more frequent gain of 2p15 and 2p16.1, including
the well-known proto-oncogenes REL and BCL11A, as compared to
non-GCB DLBCL (15-19% vs. 5%, p = ns). Instead, the most
frequently altered genomic regions in the non-GCB DLBCL
subgroup in comparison to GCB DLBCL patients were gains of
18q12.2 (20% vs. 4%, p = ns) and 18q23 (20% vs. 4-7%, p = ns),
and loss of 9p21.3 (30-35% vs. 7%, p = 0.026). Of the other
genomic imbalances (<10% of all patients), only the gain of
18q21.2-33 in the non-GCB subgroup was significantly more
frequent when compared to the GCB subgroup (20% vs. 0%,
p = 0.027).

Copy Number Associated Gene Expression Changes 

To identify genes with altered expression due to large genomic 
aberrations, we combined the CNA and gene expression data 
obtained from 38 patients for whom both data sets were available. 
Our analysis showed that copy number gains and losses of 31 
genes were associated with a simultaneous and significant increase 
or decrease in gene expression. The majority (n = 29) of the genes
were over-expressed due to copy number gains in chromosomes 2 

Table 1. Patient characteristics in the screening cohort.

Characteristic        Category              n (%)
All, n (%)                                  51 (100)
Gender                Female                19 (37)
                      Male                  32 (63)
Age                   Median (range)        55 (20-65)
                      <60                   35 (69)
                      60-65                 16 (31)
                      >65                   0 (0)
Histology             GCB                   27 (53)
                      Non-GCB               20 (39)
                      Other/Unclassified    4 (8)
Performance status    0-1                   33 (65)
                      2-3                   18 (35)
B-symptoms                                  31 (61)
Elevated LDH                                49 (96)
Stage                 I-II                  1 (3)
                      III-IV                37 (97)
aaIPI                 0                     0 (0)
                      1                     0 (0)
                      2                     36 (71)
                      3                     15 (29)
mRNA analysis                               38 (75)

doi:10.1371/journal.pone.0091031.t001



(2p15 and 2p16.1) and 18 (18q21.2, 18q21.31-33, 18q23). In
contrast, two genes were suppressed and located in regions of copy
number losses at 9p21.3. The aforementioned CNA areas and
target genes are presented in more detail in Table S2 in File S1
and at http://csbi.ltdk.helsinki.fi/pub/lymphoma. As an example,
the patients with 2p15 amplification had elevated COMMD1
expression with a p-value and FDR <0.001.

Prognostic Significance of Chromosomal Alterations 

In the whole series of 51 patients with the aCGH data, we found
two chromosomal regions with genomic alterations associated with
PFS and lymphoma-specific OS. Patients with amplification in
chromosome 2p15 (n = 6; 12% of all patients) had inferior PFS in
comparison to patients without this gain (p = 0.010; Figure 1A). In
addition, a non-significant trend towards poor lymphoma-specific
OS was observed (p = 0.131; Figure 1B). Similarly, patients
with amplification in 18q12.2 (n = 6) had worse PFS (p = 0.044)
and lymphoma-specific OS (p<0.001) than patients without this
gain (Figures 1C and 1D). The survival associated gain in 2p15
contained the genes B3GNT2 (UDP-GlcNAc:betaGal
beta-1,3-N-acetylglucosaminyltransferase 2), FAM161A (family with sequence
similarity 161, member A), CCT4 (chaperonin containing TCP1
subunit 4), COMMD1 (copper metabolism (Murr1) domain
containing 1), and XPO1 (exportin 1), and the amplification was
associated with their over-expression (Figure 2). Association of
COMMD1 and XPO1 over-expression with 2p15 amplification was
further confirmed by qRT-PCR (Figure S1 in File S1). The
amplification in 18q12.2 contained the CELF4 gene but we found



Table 2. Genome-wide overview of recurrent gains and losses.

Band          Gain Freq %   Loss Freq %   Position (Mb)    Possible target genes
1q24.2        12            8             167.69-167.76    MPZL1
1q44          20            6             247.00-247.10    AHCTF1
2p16.1-p15    12            NA            60.68-63.27      BCL11A, PAPOLG, REL, PUS10, PEX13, KIAA1841, AHSA2, USP34,
                                                           XPO1, FAM161A, CCT4, COMMD1, B3GNT2, TMEM17, EHBP1
9p21.3        NA            14-18         21.80-22.01      MTAP, CDKN2A, C9orf53, CDKN2B
14q11.2       2             12            22.938-22.939    TRDV3
18q12.2       12            NA            34.82-35.15      CELF4
18q21.1       12            4             44.06-44.34      LOXHD1, ST8SIA5
18q23         12-16         4             77.62-77.71      KCNG2, PQLC1
20q11.22      16            NA            33.13-33.15      MAP1LC3A

doi:10.1371/journal.pone.0091031.t002

Figure 1. Survival of DLBCL patients according to genomic aberrations. PFS (A and C) and lymphoma-associated OS (B and D) rates
according to indicated genomic aberration.
doi:10.1371/journal.pone.0091031.g001



no correlation between the gain and CELF4 gene expression 
(Figure 2). 

Consistent with previous studies on lymphomas [27,28,31], we 
identified REL and BCL11A located at 2p16.1, and BCL2 at
18q21.3, among the genes whose expression was linked with
copy number gains (Table S2 in File S1). CDKN2A and MTAP
genes, which have also been described in lymphomas, specifically
in the chemoresistant and ABC type DLBCLs [27,28,30,32,33],
were in turn located in the regions of copy number losses at 
9p21.3. However, these genomic alterations were not associated 
with survival in our study population. 

To further identify biomarker candidates located in the survival 
associated 2p15 amplification locus, we performed survival
analysis for five genes whose expression values correlated with
amplification. Using gene expression values as continuous
variables in Cox univariate analyses, only COMMD1 expression
was identified to have prognostic impact on PFS (p = 0.037). When
Kaplan-Meier analysis was performed, patients with high
COMMD1 expression had significantly inferior PFS as compared
to patients with low expression (5-year PFS 65% vs. 100%,
p = 0.033; Figure 3A). Association of COMMD1 expression with
survival was further validated using qRT-PCR and Cox
univariate analysis with continuous variables (p = 0.009) and
Kaplan-Meier analysis with categorical data (p = 0.031,
Figure 3B). In comparison, the expression of XPO1, another
selected gene for qRT-PCR validation, was not significantly
associated with survival (p = 0.345). Correlation coefficients
between the expression arrays and qRT-PCR were 0.641 (p<0.001)
for COMMD1 and 0.494 (p = 0.037) for XPO1.

COMMD1 Protein Expression is Associated with Outcome 

Considering that our multi-level analysis revealed COMMD1 to
be an amplified, over-expressed and survival-associated gene in
DLBCL, we extended COMMD1 analyses to the protein level. 
IHC stainings were performed on a TMA consisting of 70 
lymphoma samples from the patients treated in the NLG-LBC-04 
protocol (Table 3). Overall, intensity of COMMD1 positivity was 
highly variable (Figure 4A-B). COMMD1 immunoreactivity was
primarily localized in a perinuclear, granular, cytoplasmic pattern in
lymphoma cells (Figure 4B), but was also seen in endothelial cells and
macrophages with a more uniform cytoplasmic staining pattern.

The prognostic significance of COMMD1 expression and 
correlation with mRNA data were assessed by computerized 
image analysis of COMMD1 positivity in the tumour tissue. In the 
univariate analysis the increasing COMMD1 positivity was an 
adverse prognostic factor for PFS (p = 0.003). The cutoff point for 
survival outcomes was selected by ROC curve analysis, resulting in 



PLOS ONE | www.plosone.org 






March 2014 | Volume 9 | Issue 3 | e91031 



Integrative Genomic Profiling in DLBCL 




Figure 2. Expression of genes associated with amplifications in the 2p15 and 18q12.2 loci. Boxes contain expression values between the
25th and 75th percentile in the tumour subgroup. The extremes denoted by asterisks represent maximum and minimum expression values.
[Box plot panels for 2p15 and 18q12.2; group labels in the panels: no CNA (n=33) vs. CNA (n=5), and no CNA (n=34) vs. CNA (n=4).]
doi:10.1371/journal.pone.0091031.g002



a staining coverage of 8.9% being the most discriminative value 
(median 7.3%, range 0-24%), with an area under the curve (AUC)
value of 0.663 (95% CI 0.516-0.810, p = 0.027). In Kaplan-Meier
analyses, the patients with high COMMD1 expression had a
significantly worse PFS and a trend towards adverse lymphoma-associated
OS in comparison to the remaining patients with lower
COMMD1 expression (5-year PFS 47% vs. 79%, p = 0.005
(Figure 4C) and 5-year OS 75% vs. 90%, p = 0.081). According
to COMMD1 expression, the relative risk of relapse was 3.2 (95%
CI 1.361-7.584, p = 0.008) and death 2.825 (95% CI 0.837-9.536,
p = 0.094). In multivariate analysis with aaIPI, COMMD1
expression retained its prognostic value on PFS (RR 2.996; CI 
1.210-7.418, p = 0.018). When clinical characteristics of the 
patients were compared according to COMMD1 expression, no 
differences in gender, subtype, age, LDH level or stage were 
observed between the subgroups (Table 3). However, low 



COMMD1 expression was associated with low performance 
status. 

We also examined the prognostic impact of COMMD1 expression
according to molecular subtype, and found that a significant 
adverse prognostic impact of COMMD1 expression was restricted 
to the GCB subgroup (Fig. 4D; p = 0.029). The relative risk of 
relapse according to COMMD1 expression within the GCB 
subgroup was 3.434 (95% CI 1.056-11.164, p = 0.040). Overall, 
immunohistochemically defined molecular subgroup was not 
associated with survival. 

In contrast to the results from genomic and transcriptomic 
levels, no correlation was found between COMMD1 protein levels 
and CNA or gene expression data (r = 0.236). The observation 
suggests that post-transcriptional mechanisms may be involved in 
the regulation of COMMD1 protein levels in DLBCL. 
Figure 3. PFS according to COMMD1 expression. A. PFS according to exon array based COMMD1 expression values. B. PFS according to
quantitative PCR analysis based COMMD1 expression values. In both A and B, the ideal cutoff values have been calculated using ROC curve analyses.
In A the estimated area under the curve (AUC) was 0.717 (p = 0.063, 95% CI 0.531-0.903). In B the AUC was 0.759 (p = 0.062, 95% CI 0.468-1.000).
doi:10.1371/journal.pone.0091031.g003



COMMD1 Expression in an Independent DLBCL Series 

In order to further validate the importance of COMMD1 in 
DLBCL, we analysed the prognostic significance of COMMD1 
expression in a larger independent cohort of 146 DLBCL patients 
treated with chemoimmunotherapy (Table 3). Median age of the 
whole cohort was 63 years (range, 16-84 years), median follow-up 
64 months (range 20-133 months), predicted 5-year PFS 74%, 
lymphoma-specific OS 79%, and OS 71%. While high IPI score
was a strong predictor for survival (p<0.001), immunohistochemically
defined molecular subgroup was not associated with
outcome. 

The clinical features of the patients according to COMMD1 
expression (low versus high, cut-off defined according to TMA 
cohort) are summarized in Table 3. Accordingly, no differences 
were observed between COMMD1 low and high subgroups. In 
Kaplan-Meier analyses, PFS at five years for the patients with high 
Figure 4. COMMD1 protein expression and outcome. Representative examples of low (A) and high (B) expression levels of COMMD1 in FFPE
DLBCL tissue (original magnifications 100×, and 400×). C-D. Outcome according to COMMD1 expression in the trial specific TMA cohort. PFS in the
whole TMA cohort (C) and in the GCB subgroup (D). E-F. Outcome according to COMMD1 expression in the validation cohort. PFS in the whole
validation cohort (E) and in the GCB subgroup (F). The cutoff point (staining coverage of 8.9%) for survival outcomes (COMMD1 low vs high) was
selected by the ROC curve analysis in the training set, and then applied also to the validation cohort.
doi:10.1371/journal.pone.0091031.g004
COMMD1 expression was 64% compared with 79% for those
with lower expression levels (p = 0.034; Figure 4E). When adjusted 
for IPI, COMMD1 expression maintained its prognostic effect on 
PFS (p = 0.023). In addition, when adjusted for age (<60 vs. >60), 
COMMD1 expression remained a predictor for PFS (p = 0.035).
According to COMMD1 expression, the relative risk of relapse
was 1.9 (95% CI 1.040-3.606, p = 0.037). In multivariate analysis
with IPI, COMMD1 retained its prognostic value on PFS (RR 2.0;
95% CI 1.037-3.730, p = 0.038).

Finally, when adjusted for molecular subgroups, COMMD1 
expression was marginally predictive for PFS (p = 0.066). When
COMMD1-related PFS was analysed separately for the patients in
different molecular subtypes, there was a non-significant difference 
in PFS between COMMD1 high and low subgroups in the GCB 
DLBCLs (Fig. 4F; p = NS). Collectively, the results in our two 
independent cohorts provide evidence that COMMD1 is a novel 
survival-associated marker in DLBCLs. 

Discussion 

Although the addition of rituximab to chemotherapy has 
considerably improved the survival rates of DLBCL, the patients 
with high IPI scores still have a poor prognosis. Here we have 
studied genomic alterations and gene expression in freshly frozen 
lymphoma tissue collected prospectively from patients included in 
a Nordic phase II study for young high-risk DLBCL patients. With 
this comprehensive approach we have identified copy number 
gain at 2p15 driving COMMD1 mRNA upregulation with impact
on survival. The gain was more frequently seen in the GCB than 
ABC DLBCLs. Furthermore, we have validated the results at the 
protein level by IHC in the same patient cohort as well as in an 
independent larger series. These data show that COMMD1 is a 
novel biomarker candidate that may be useful in improving risk 
stratification for DLBCL patients. 

In a previous prospective study of poor prognosis DLBCL 
patients treated uniformly with dose-escalated CHOP followed by 
high-dose chemotherapy with autologous stem cell transplantation,
the amplification of 2p16.1 was suggested to be associated
with poor treatment response [34]. Candidate genes mapped 
within this amplification include proto-oncogenes REL and 
BCL11A, with increased expression levels [31,35]. However, the 
functional correlation of this amplification event has not been 
illustrated [36]. Although REL and BCL11A expression levels were 
integrated with chromosomal amplification in our cohort, they 
were not associated with survival. Instead, we observed that the 
increased copy number in region 2p15 was associated with adverse
PFS and lymphoma-specific OS. The region covers five genes,
CCT4, XPO1, COMMD1, FAM161A and B3GNT2, which are all
potentially important regulators of cellular growth. None of the 
genes, however, have previously been associated with DLBCL 
biology. While further investigation of the roles of these genes in 
DLBCL pathogenesis is needed, our results demonstrate that 
COMMD1 is a candidate genetic prognostic biomarker in DLBCL. 

COMMD1 is a pleiotropic factor that participates in multiple 
processes, including copper metabolism, sodium excretion, 
inflammatory responses, and adaptation to hypoxia [37]. Recent 
mechanistic studies have revealed that COMMD1 suppresses NF- 
kB- and HIF-mediated gene expression [38,39]. COMMD1 is 
underexpressed in some carcinomas, and low COMMD1 expres- 
sion has been associated with inferior clinical outcome in patients 
with endometrial cancer [39]. In lymphomas, the prognostic role
of COMMD1 has not previously been established. In the 
Oncomine database [40] the expression of COMMD1 is increased 
in lymphomas in comparison to other cancers [41]. Furthermore, 
two independent studies showed that COMMD1 expression is 
higher in DLBCLs than follicular lymphomas [41,42], whereas no 
differences in the COMMD1 expression were observed between 
molecular subtypes of DLBCL [9]. At the present time, it remains 
unclear how COMMD1 is involved in a variety of seemingly 
unrelated and even opposite cellular activities. However, in most 
instances including lymphomas, the mechanism is likely via 
protein-protein interactions and ubiquitination [39,43,44]. 

To confirm whether COMMD1 expression could be useful to 
recognize DLBCLs with a more aggressive clinical course, we 
studied COMMD1 protein expression by IHC, which is a method 
that can be easily incorporated into a routine diagnostic approach. 
The predictive value of COMMD1 positivity was first defined in a 
training cohort of clinically high-risk DLBCL patients, and 
subsequently confirmed in an independent, larger and more 
heterogeneous DLBCL cohort. Thus, COMMD1 expression 
seems to represent a potential novel prognostic marker preferen- 
tially in the GCB type molecular subgroup. 

Recently Monti and colleagues investigated gene expression and 
copy number data in 168 DLBCL patients, with the focus on the 
role of p53/cell cycle pathway in patient survival [28]. Interest- 
ingly, even though they reported that the region harboring 
COMMD1 is the second highest region in their cohort, and 
COMMD1 among the top genes in the region (Table IS in their 
publication [28]), they did not study its survival association. 
Additionally, our findings are supported by reports showing 
amplification of 2p15-p16 with the concordant elevated gene 
expression in DLBCL [28,31,34,35]. These findings indicate that while 
integration of copy number and expression data is known to be a 
powerful approach to find driver genes, a carefully selected, 
homogeneous patient cohort together with integrative analysis 
can produce clear and important findings that may not be evident 
in more heterogeneous cohorts. 

Since CNA status was integrated with gene expression data, 
only the genes whose expression correlated with the CNAs were 
identified. The results from the qRT-PCR also correlated with the 
exon array data. However, we found no correlation between 
COMMD1 protein and gene expression levels. The reason for this 
difference is currently unknown but based on the literature 
indicating a strong regulatory role for the processes downstream of 
transcription [45,46], it is plausible to speculate that post- 
transcriptional mechanisms may have a role in the regulation of 
COMMD1 protein levels in DLBCL. The work demonstrating 
that COMMD1 cellular levels are tightly controlled by ubiquitination 
[47] provides additional evidence that the regulatory level 
may be posttranslational. Together with the CNA data the results 
indicate that COMMD1 expression is regulated at multiple levels. 

Amplification in 18q12.2 was found to be another significant 
CNA associated with inferior outcome. The gain has not been 
previously associated with survival in DLBCL, but its deletion has 
been reported to correlate with poor outcome in colorectal 
carcinoma. The CNA was found to harbour a single gene, 
CELF4/CUGBP, coding for a member of a family of RNA binding 
proteins playing an essential role in post-transcriptional gene 
regulation. However, despite the gain in the 18q12.2 locus we were 
not able to demonstrate the over-expression of the CELF4 gene. 
While more work is needed to establish the exact role of the 
18q12.2 gain in DLBCL, our data support the prognostic 
importance of this region in DLBCL. 

Consistent with previous studies [27,30,32,33], expression of 
CDKN2A was associated with deletion of 9p21.3 in our patient 
cohort, and especially in the non-GCB subtype. However, overall 
incidence of the deletion was lower and no correlation with 
survival was found. Considering that aCGH and exon arrays were 
not performed on purified tumour samples it is possible that the 
presence of background material (tumour infiltrating non-malig- 
nant cells) could to some extent dilute tumour specific genetic 
alterations and explain a lower incidence of 9p21 deletions in our 
material. However, a more likely explanation for the differences in 
the results between different studies is that clinical and histopath- 
ological features of the study populations are not identical. The 
differences in the treatments may also contribute. 

In conclusion, we have integrated copy number alteration and 
transcriptomic data in a carefully chosen high-risk DLBCL patient 
cohort to identify biological markers that could be used in risk 
stratification. We found two profiles with increased copy number 
of genes in chromosomes 2p15 and 18q12.2 that predicted a poor 
outcome for a subgroup of DLBCL patients. Furthermore, we 
identified a novel potential genetic driver event with prognostic 
significance. Notably, the prognostic impact of COMMD1 on 
survival was also observed at the protein level. The strengths of our 
study are a prospectively collected and homogeneously treated 
study population, the availability of copy number, gene expression 
and IHC data from the same patients, the possibility to correlate 
the findings with clinical outcome, and the validation of the findings in an 
independent cohort of DLBCL patients. Our results demonstrate 
that it is possible to use relatively small but carefully designed 
prospective cohorts as hypothesis-generating material to identify 
a list of putative targets, and then validate and extend the major 
results to the protein level. Taken together, the results presented 
herein are promising and novel, and emphasize the importance of 